Selectivity Options for Age-Structured Stock Assessments
A comparative framework using RTMB
Selectivity is a fundamental component of age-structured stock assessment models, governing how fishing mortality and survey catchability vary with age or size. The choice of selectivity parameterization affects estimates of population abundance, fishing mortality, and management reference points. This manuscript compares several selectivity formulations—including standard logistic, double logistic, spline-based, and time-varying 2D autoregressive approaches—within a unified RTMB framework. We examine parameter identifiability, prior sensitivity, and the practical consequences of selectivity misspecification using simulation and application to eastern Bering Sea walleye pollock (Gadus chalcogrammus). Results highlight trade-offs between flexibility and estimability and provide guidance for practitioners choosing among selectivity options in operational assessments.
Keywords: selectivity, stock assessment, RTMB, fisheries, walleye pollock
0.1 Introduction
Selectivity functions describe the relative vulnerability of fish to capture as a function of age (or size) and are among the most influential—yet least observable—components of stock assessment models. The assumed shape of the selectivity curve directly affects estimates of spawning biomass, recruitment, and fishing mortality, and misspecification can propagate into biased reference points and management advice (Punt, Hurtado-Ferro, and Whitten 2013; Thompson 1994).
Despite its importance, selectivity is often treated as a modelling convenience rather than a biological or technological quantity to be estimated carefully. Most operational assessments adopt a logistic or double-logistic functional form, chosen for parsimony rather than fidelity to the underlying catch process. More flexible alternatives—such as penalized splines or non-parametric approaches—can reduce structural bias but may introduce identifiability problems, particularly when data are sparse or conflicting.
This manuscript develops a comparative framework for evaluating selectivity options within the R Template Model Builder (RTMB) environment. RTMB provides automatic differentiation, Laplace approximation for random effects, and integration with Bayesian sampling via SparseNUTS, making it a natural platform for exploring both maximum likelihood and posterior-based inference on selectivity parameters.
We organize the comparison around four selectivity families:
- Standard logistic (Section 0.4): the two-parameter ascending logistic, widely used for fisheries where retention is assumed to be monotonically increasing with age.
- Double logistic (Section 1.5): a dome-shaped three-parameter formulation allowing selectivity to decline at older ages, motivated by gear avoidance, ontogenetic habitat shifts, or differential availability.
- Spline-based (Section 2.9): a penalized B-spline approach offering flexible, data-driven selectivity shapes with smoothness controlled by a penalty parameter.
- Time-varying 2D AR1 (Section 3.6): a separable autoregressive structure over year and age dimensions, allowing selectivity to evolve smoothly through time while maintaining age-specific coherence via a Kronecker-structured precision matrix.
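As a concrete reference for the first two families, the curves can be sketched numerically. The snippet below (in Python, for illustration only) uses common parameterizations: an ascending logistic with inflection `a50` and slope `delta`, and a dome shape built as the product of ascending and descending logistics rescaled to a maximum of one. The four-parameter product form shown is one common variant; the three-parameter formulation used in this manuscript is given in Section 1.5.

```python
import numpy as np

def logistic_selectivity(age, a50, delta):
    """Two-parameter ascending logistic: s(a) = 1 / (1 + exp(-(a - a50)/delta))."""
    return 1.0 / (1.0 + np.exp(-(age - a50) / delta))

def double_logistic_selectivity(age, a50_asc, delta_asc, a50_desc, delta_desc):
    """Dome shape as the product of an ascending and a descending logistic,
    rescaled so the maximum selectivity equals 1 (an illustrative variant)."""
    asc = 1.0 / (1.0 + np.exp(-(age - a50_asc) / delta_asc))
    desc = 1.0 / (1.0 + np.exp((age - a50_desc) / delta_desc))
    s = asc * desc
    return s / s.max()

ages = np.arange(1, 16)
s_log = logistic_selectivity(ages, a50=4.0, delta=1.0)          # monotone ascending
s_dome = double_logistic_selectivity(ages, 4.0, 1.0, 10.0, 1.5)  # declines at old ages
```

The logistic curve equals 0.5 at `a50` and increases monotonically; the dome-shaped curve declines toward older ages, which is exactly the behavior that can become confounded with natural mortality when composition data are sparse at those ages.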
For each family, we present the mathematical formulation, an RTMB implementation with prior specification, parameter sensitivity analysis, and MCMC-based posterior evaluation. We then apply all four formulations to eastern Bering Sea walleye pollock data to illustrate practical trade-offs.
0.2 Review Scope and Literature Context
This review focuses on methodological literature that is directly relevant to three practical assessment decisions: (i) choice of selectivity functional form, (ii) model selection among competing selectivity hypotheses, and (iii) construction of time-varying selectivity processes with explicit regularization.
0.2.1 Selectivity form choice in integrated assessments
Operationally, selectivity in integrated assessments is best interpreted as the combined effect of gear contact/retention and fish availability, not as a purely mechanistic gear curve (Punt, Hurtado-Ferro, and Whitten 2013; Privitera-Johnson, Methot, and Punt 2022). In that setting, the choice of selectivity parameterization is consequential because it changes inference on spawning biomass, recruitment, and fishing mortality.
Selectivity-form selection therefore requires both statistical and biological diagnostics, including residual pattern checks, sensitivity analyses, and information-based comparisons among plausible alternatives (Punt, Hurtado-Ferro, and Whitten 2013). A key warning is that non-monotone selectivity can be strongly confounded with natural mortality, particularly when older ages are weakly informed by composition data (Thompson 1994).
0.2.2 Gear-selectivity estimation foundations
Classical selectivity-estimation literature remains important for assessment modeling because it clarifies what selectivity data can identify and where latent availability effects remain unresolved (Millar and Fryer 1999; Wileman et al. 1996). This evidence base supports explicit regularization whenever flexible age-specific selectivity surfaces are estimated.
0.2.3 Time-varying selectivity literature
Simulation studies show that time-varying selectivity can materially improve or degrade assessment performance depending on how process flexibility matches the data-generating mechanism (Linton and Bence 2011). Practitioner guidance also emphasizes that subjective choices such as block boundaries, smoothing penalties, and process assumptions can dominate outcomes (Martell and Stewart 2014).
Recent semi-parametric approaches formalize autocorrelation across both age and time, motivating separable latent-process structures that can be estimated within modern AD frameworks (Xu et al. 2019). Broader statistical catch-at-age (SCA) model-selection and time-varying process work reinforces the same bias-variance trade-off (Wilberg and Bence 2006, 2008).
0.3 Statistical Framing as a Latent Process
Although selectivity is often presented as a deterministic curve, it is more usefully viewed as a latent process that allocates fishing intensity over age and time. Under this framing, selecting a functional form is equivalent to selecting a prior precision structure on an unobserved selectivity surface.
Low-dimensional parametric forms (for example, logistic and double logistic) impose strong structural priors and are robust when data are sparse. Flexible forms (for example, spline and time-varying AR structures) reduce structural bias risk, but require explicit regularization to prevent confounding and overfitting (Martell and Stewart 2014; Xu et al. 2019).
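The regularization required by flexible forms can be made concrete with the squared second-difference (P-spline-style) penalty commonly used for spline selectivity: the smoothing parameter scales a quadratic form in the basis coefficients, leaving linear trends unpenalized while penalizing curvature. A minimal numerical sketch in Python (the manuscript's RTMB implementation may differ in details):

```python
import numpy as np

def penalty_matrix(n_coef):
    """P-spline-style penalty P = D'D, where D is the second-difference
    operator; the penalty added to the objective is lambda * b' P b."""
    D = np.diff(np.eye(n_coef), n=2, axis=0)  # (n_coef - 2) x n_coef
    return D.T @ D

P = penalty_matrix(8)
b_linear = np.linspace(0.0, 1.0, 8)                            # no curvature
b_wiggly = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0])  # saw-tooth

rough_linear = b_linear @ P @ b_linear  # ~0: linear coefficient trends pass free
rough_wiggly = b_wiggly @ P @ b_wiggly  # large: curvature is penalized
```

Because the penalty is quadratic, it is equivalent to a (partially improper) Gaussian prior on the coefficients with precision proportional to P, which is the sense in which penalties and priors are interchangeable here.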
This perspective motivates three principles for applied assessments:
- Match effective selectivity complexity to data information content.
- Treat regularization (penalties/priors/precision structures) as essential, not optional.
- Use model-selection tools (AIC, DIC, cross-validation, posterior predictive checks) to evaluate bias-variance trade-offs under explicit prior structure (Punt, Hurtado-Ferro, and Whitten 2013; Wilberg and Bence 2008).
Within this unifying view, logistic, dome-shaped, spline, blocked, and AR(1) approaches differ primarily in the precision matrix implied for latent selectivity deviations, rather than only in curve geometry.
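The precision-matrix view can be checked numerically for the separable year-by-age case: the joint precision of the vectorized selectivity-deviation surface is the Kronecker product of the two marginal AR(1) precisions. The sketch below (in Python, assuming standardized unit-variance AR(1) marginals for illustration) constructs that matrix directly.

```python
import numpy as np

def ar1_precision(n, rho):
    """Precision (inverse correlation) matrix of a standardized AR(1):
    tridiagonal, with interior diagonal 1 + rho^2, scaled by 1/(1 - rho^2)."""
    Q = np.zeros((n, n))
    for i in range(n):
        Q[i, i] = 1.0 + rho**2 if 0 < i < n - 1 else 1.0
        if i < n - 1:
            Q[i, i + 1] = Q[i + 1, i] = -rho
    return Q / (1.0 - rho**2)

def ar1_corr(n, rho):
    """Correlation matrix of the same process: corr(i, j) = rho^|i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

# Separable year x age structure over the vectorized deviation surface.
n_year, n_age, rho_year, rho_age = 5, 4, 0.7, 0.5
Q2d = np.kron(ar1_precision(n_year, rho_year), ar1_precision(n_age, rho_age))
```

Because the Kronecker product of inverses is the inverse of the Kronecker product, `Q2d` is exactly the inverse of the corresponding 2D AR(1) correlation matrix; both marginal precisions are sparse (tridiagonal), which is what makes Laplace-approximation-based estimation of such surfaces tractable.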
0.4 Standard Logistic Selectivity
1.5 Double Logistic Selectivity
2.9 Spline-Based Selectivity
3.6 Time-Varying Selectivity with 2D Autoregressive Structure
4.7 Comparison and Discussion
The four selectivity families evaluated here span a practical gradient from low-dimensional structural assumptions to high-dimensional latent-process flexibility. The logistic model (Section 0.4; Figure 1) provides an interpretable baseline with strong monotonicity assumptions and minimal parameter burden. The double-logistic model (Section 1.5; Table 1, Figure 2, Figure 4) adds biologically plausible dome-shape behavior, but also increases confounding risk with natural mortality and cohort effects when data support is limited (Thompson 1994).
Spline selectivity (Section 2.9; Figure 10; Figure 11) offers useful middle ground: flexible shape estimation with transparent control of roughness via penalization. In review terms, this is a semi-parametric compromise between rigid parametric forms and fully dynamic latent surfaces. The time-varying 2D AR1 approach (Section 3.6; Figure 12; Figure 13) further extends flexibility by allowing coherent year-by-age evolution with explicit process structure, aligning with recent recommendations for autocorrelated selectivity deviations (Linton and Bence 2011; Xu et al. 2019).
For scientific review and operational assessment use, a defensible workflow is:
- Start with a parsimonious baseline (logistic or double logistic) and diagnose fit deficiencies in composition residuals and retrospective patterns.
- Introduce additional flexibility (spline or time-varying structures) only when diagnostics indicate systematic lack of fit and data support richer structure.
- Compare candidate models with complementary evidence, including information criteria, predictive checks, and sensitivity of management quantities (Punt, Hurtado-Ferro, and Whitten 2013; Wilberg and Bence 2008).
- Report regularization choices and identifiability diagnostics explicitly, because inferred selectivity dynamics are conditional on prior/penalty assumptions (Martell and Stewart 2014; Privitera-Johnson, Methot, and Punt 2022).
Overall, the central trade-off is not whether one curve family is universally “best”, but whether the assumed latent structure is commensurate with the available information and management objectives.